Coding with Jesse

Goldilocks and the Three Developers

Goldilocks was the lead of a software development team. She needed to review pull requests from three of her team members.

The first developer's code was a mess. It relied on some deprecated features of an outdated library. Its few modules were long and complex, trying to do too many different things. There were no tests, so it was impossible to be sure the code was bug-free. The architecture was tied to a single server, so it could never scale up. There was no way to know whether it did what it was supposed to do.

The second developer's code was also a mess. The system was built using some brand new libraries and coding paradigms. The system comprised a dozen different interconnected microservices. There was a very thorough test suite, testing every implementation detail. The system included infrastructure as code, but couldn't run on a single computer. There was no way to know whether it did what it was supposed to do.

The third developer's code was just right. It used the latest versions of libraries the team was familiar with. The system was split up into a dozen simple modules. It was obvious what each module did, and how it fit within the business requirements. There were a few tests for the core functionality, so she knew that it was working. The system was easy to get running, yet could scale up infinitely. It was very easy to understand, and to know that it did what it was supposed to do.

Goldilocks then had a meeting with the three developers.

She told the first developer their code was under-engineered. She said they should take some time to simplify it and make it easier for other developers to understand and work with it.

She told the second developer their code was over-engineered. She said they should take some time to simplify it and make it easier for other developers to understand and work with it.

She said well done to the third developer and approved the pull request.

Published on June 5th, 2024. © Jesse Skinner

Unable to locate credentials in AWS

The Problem

If you have servers in AWS doing a high volume of AWS service requests, you may come across some rare but frustrating sporadic credential errors like these:

"Unable to locate credentials"

or if you're using aws-sdk in Node.js:

"CredentialsProviderError: Could not load credentials from any providers"

I'm not totally sure why these errors happen, but typically I see them happen across multiple services, accounts and regions around the same time, which leads me to believe that there can be some sporadic flakiness in the metadata service used for fetching IAM credentials.

I tried using metadata retries and other configuration parameters to prevent this, but they didn't seem to make any difference.

The Solution

Looking for a solution, I found this buried in the AWS documentation for instance metadata retrieval:

"If you're using the IMDS to retrieve AWS security credentials, avoid querying for credentials during every transaction or concurrently from a high number of threads or processes, as this might lead to throttling. Instead, we recommend that you cache the credentials until they start approaching their expiry time."

Now, I don't think this throttling was the source of all the errors I was seeing, but it may be playing a role. Maybe the metadata service's tolerance for throttling changes over time as demand changes; I don't know.

Either way, this gave me an idea: write a bash script to cache the IAM credentials in ~/.aws/credentials so they could be used by the AWS CLI as well as any Node.js or Python clients accessing AWS services:

#!/bin/bash

IMDS_URL="http://169.254.169.254/latest/meta-data/iam/security-credentials/"
# use $HOME rather than ~, which doesn't expand inside quotes
AWS_CREDENTIALS_PATH="$HOME/.aws/credentials"
PROFILE_NAME="default"

# 4.5 minutes, because new credentials appear 5 minutes before expiry
EXPIRY_BUFFER=270

get_aws_credentials() {
    local role_name=$(curl -s "$IMDS_URL")
    local credentials_url="${IMDS_URL}${role_name}"
    local response=$(curl -s "$credentials_url")

    local access_key_id=$(echo "$response" | jq -r '.AccessKeyId')
    local secret_access_key=$(echo "$response" | jq -r '.SecretAccessKey')
    local token=$(echo "$response" | jq -r '.Token')
    local expiration=$(echo "$response" | jq -r '.Expiration')
    local expiration_time=$(date -d "$expiration" +%s)

    mkdir -p "$(dirname "$AWS_CREDENTIALS_PATH")"
    echo "[$PROFILE_NAME]" > "$AWS_CREDENTIALS_PATH"
    echo "aws_access_key_id = $access_key_id" >> "$AWS_CREDENTIALS_PATH"
    echo "aws_secret_access_key = $secret_access_key" >> "$AWS_CREDENTIALS_PATH"
    echo "aws_session_token = $token" >> "$AWS_CREDENTIALS_PATH"
    echo "expiration = $expiration_time" >> "$AWS_CREDENTIALS_PATH"
}

should_fetch_credentials() {
    if [[ ! -f "$AWS_CREDENTIALS_PATH" ]]; then
        return 0
    fi

    local expiration_time=$(grep '^expiration' "$AWS_CREDENTIALS_PATH" | cut -d ' ' -f 3)
    local current_time=$(date +%s)

    if (( current_time + EXPIRY_BUFFER > expiration_time )); then
        return 0
    fi

    return 1
}

if should_fetch_credentials; then
    get_aws_credentials
fi

Since the credentials have to be refreshed every few hours, I set it up to run in a cron job every minute, to check whether the expiration time is approaching:

* * * * * /home/ec2-user/credentials.sh > /dev/null 2>&1

Voila! No more credential errors! I hope that helps. Let me know if you've run into the same errors, and if you found this approach useful.

Published on May 30th, 2024. © Jesse Skinner

You don't need permission

Watercolour illustration of a construction worker building a structure

You don't need permission to write the highest quality code you can. You don't need permission to design a reliable server architecture that won't crash. You don't need permission to develop a suite of tests to ensure bugs are caught early. You don't need permission to upgrade your dependencies, to ensure your system stays secure and modern.

Your boss, manager or client will never ask you to take time to refactor your code. They'll never ask you to set up a test suite for the code you wrote. They'll never ask you to upgrade your framework.

It's not that they don't want you to do things to improve the quality of the code. It's that they expect you to write high-quality code from the start. They believe you know what it takes to design a reliable system. They trust you to build web apps that last.

They'll never ask you to make sure your code has no bugs. They'll never ask you to make sure the new system doesn't crash. They'll never ask you to make sure your code will be understood by other developers. It all goes without saying.

So, don't put these things off for later. Don't put writing tests on the backlog. Don't put refactoring on your list of nice-to-haves. Don't put improving the reliability of your servers on the wishlist. Don't wait for permission.

Do these things every day. Make them a part of your process. You don't need permission.

Published on May 22nd, 2024. © Jesse Skinner

Web apps that last

A castle made of sand

When you're building a new web application, or even a new feature, how can you ensure that you're not creating a nightmare code base that will need to be rewritten completely in a few years?

Some people will say it's hopeless to even try to write code that will last. I've even heard people suggest that you should aim to rewrite all your code every few years. That sounds like a very expensive, wasteful strategy.

In two decades of building web apps, I've seen many codebases start as a shiny new prototype and grow into a huge system. In some cases, they've become old, ugly, painful legacy systems that teams are begging to rewrite and replace. (And often those rewrites will themselves grow into ugly, painful legacy systems!) But sometimes a codebase will remain more or less unchanged a decade later, running smoothly as ever.

I believe there are some decisions you can make when writing code that will help it to last longer, and withstand the test of time.

Change is inevitable

Probably the one thing you can be sure of is that change will come. The goals of a business will change, and the people within a business will change. There will inevitably be features added, and existing features will evolve and be repurposed. The names of the products will almost surely change. So is it even possible to write code that doesn't need to change?

I think the key is in the phrase "If it ain't broke, don't fix it". Code that is fulfilling its task, that is doing what it's supposed to do, and is bug-free, is code that will last a long time.

Nightmare code

To understand how to write code that will last, let's think about the opposite: a nightmare codebase that demands to be rewritten. The worst I've seen is a web server written as a giant single file with thousands of lines of code. A system built like a house of cards, where changing one thing will break everything. Code that is very difficult to read or understand. Code that literally gives developers nightmares.

Unfortunately, this is often the kind of code that comes out of throwing together a quick prototype. A hero developer stays up late one night and churns out a first draft like a stream-of-consciousness. The next morning, the business owner is delighted to see their dreams come to life. Everyone's happy.

Then, they ask to change just one thing. Add this little feature. And this other feature. And now this user needs this other thing. And could you just change that other thing quick?

Months later, and this rough draft has accidentally become the foundation for a web application that continues to grow, held together with digital duct tape.

So how do you prevent this nightmare from unfolding?

Do one thing, and do it well

Modularity is extremely important in writing code that will last. And a good module is a piece of code that does one thing, and does it well. That one thing might be interfacing with a single database table. Or it could be handling HTTP calls on a single URL and passing the data to and from other modules that talk to the database.

I find generally that it works best when each module has zero, one or two major dependencies. With zero dependencies, you have a set of functions that receive input data, process it in some way, and return results. With one dependency, you have a set of functions that act as an abstraction or interface to that dependency. With two dependencies, you're writing code that bridges the gap between the two, acting as an adapter or controller.

More than two major dependencies, and you should ask yourself if there's any way to split things up into smaller pieces that are responsible for fewer things.
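To make that concrete, here's a rough JavaScript sketch; the module names and shapes are invented for illustration, not taken from a real system:

```javascript
// Zero dependencies: pure functions that take input and return output.
function normalizeEmail(email) {
  return email.trim().toLowerCase();
}

// One dependency: a thin interface over something external (here, a
// Map standing in for a real database client).
function createUserStore(db) {
  return {
    findByEmail: (email) => db.get(email),
    save: (user) => db.set(user.email, user),
  };
}

// Two dependencies: an adapter that bridges two modules without
// knowing how either works internally.
function createSignupController(userStore, mailer) {
  return function signup(email) {
    const user = { email: normalizeEmail(email) };
    userStore.save(user);
    mailer.sendWelcome(user.email);
    return user;
  };
}
```

Notice that the controller knows nothing about how users are stored or how mail is sent, so swapping out the database or email provider never touches it.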

A dependency might not always be a program you have to install. Another module in your system is also a dependency. So is your business logic. I think about dependencies as anything that your module "knows about". This could even be the shape of certain data structures that might not have explicit type definitions.

The fewer things a module knows about, the more likely the module will be able to persist unchanged over time, because there will be fewer reasons to change it.

When your web application is built with small, independent modules that only do one thing, the chances are much, much lower that any of those pieces will need to be rewritten. And the chance of the whole application needing to be rewritten all at once drops to nearly zero. Even if you later want to do a major redesign, you'll find it easier to copy over lots of these older, simple modules to reuse in the new system.

Finally, a tangible example

Let's say you need to send out a Forgot Password email. You could do the whole thing in one file, but I would prefer to split it up like this:

  1. A module that knows how to actually send an email using AWS SES or something, but doesn't know the recipient, subject or body of the email. function sendEmail(toAddress, subject, body), for example.

  2. A module that knows about the subject and body of the Forgot Password email, but doesn't know who it's sending to or what the reset URL will be. function sendForgotPasswordEmail(toAddress, resetUrl)

  3. A module for the user table in the database, that has a function to generate a reset code, but doesn't know how the reset code will be used or even whether an email will be sent out. function createResetCode(userEmail)

  4. A module that knows about the URL structure of the site, and has a function that can generate a password reset link from a reset code. function getResetUrlFromCode(code)

  5. A module that ties everything together. It takes an email address, calls createResetCode, uses that to call getResetUrlFromCode, and then passes the recipient address and reset URL to sendForgotPasswordEmail. function forgotPassword(email)

  6. A user interface widget with a form, a text field and a button, so the user can type in their email address and click Send password reset link. When the form is submitted, it tells the user to go wait for the email.

  7. A module that is responsible for the server-side password reset part of the system. It receives the form submission, pulls the email address from the form data, calls forgotPassword, and then sends a success status back to the browser.
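Sketched in JavaScript, the wiring might look like this. The email sending and database parts are stubbed out, and the URL is a placeholder, since the real implementations depend on your email provider and database:

```javascript
// Module 1: knows how to send an email, nothing about its content.
// (Stubbed here; a real version might call AWS SES.)
function sendEmail(toAddress, subject, body) {
  console.log(`Sending "${subject}" to ${toAddress}`);
}

// Module 2: knows the Forgot Password subject and body only.
function sendForgotPasswordEmail(toAddress, resetUrl) {
  sendEmail(toAddress, "Reset your password", `Click here: ${resetUrl}`);
}

// Module 3: knows the user table and generates a reset code.
// (Stubbed with a random code instead of a real database write.)
function createResetCode(userEmail) {
  return Math.random().toString(36).slice(2, 10);
}

// Module 4: knows the site's URL structure (placeholder domain).
function getResetUrlFromCode(code) {
  return `https://example.com/reset-password?code=${code}`;
}

// Module 5: ties everything together, knowing nothing about email
// contents, URL structure or database internals.
function forgotPassword(email) {
  const code = createResetCode(email);
  const resetUrl = getResetUrlFromCode(code);
  sendForgotPasswordEmail(email, resetUrl);
  return resetUrl;
}
```

Each function only knows about the layer directly beneath it, which is exactly what makes the pieces reusable on their own.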

Here, only a few modules are likely to change. You'll probably see changes to the sendForgotPasswordEmail function, as well as the user interface widget. All the other modules I've outlined are very reusable, and highly unlikely to change, unless you change your email sending provider, or your database software, or something else major. Even in those situations, the code that needs to change is very isolated and easy to replace without affecting anything else.

You can even improve on this further, by having the contents of the email be database-driven, so that non-technical staff members can change the email templates themselves through an admin interface. But an architecture like this is a good starting point that makes those sorts of changes simpler to make.

A good start

If you get in the habit of writing more modular code, and splitting things up as early as possible, then the next time you're throwing together a quick prototype, you'll be able to lean on those principles in the process.

Instead of a giant ball of tangled dependencies and logic, you'll be building smaller, simpler, reusable components that can be used as solid building blocks. Some of these will be so useful and generic that you'll even be able to reuse them in completely different systems without changing them at all.

Published on February 19th, 2023. © Jesse Skinner